Abstract


A reverse transcription nested PCR (RT-PCR) sequencing methodology was developed and used to generate 
sequence data from the spike genes of three geographically and chronologically distinct human coronaviruses 229E. 
These three coronaviruses were isolated originally from the USA in the 1960s (human coronavirus 229E strain ATCC 
VR-74), the UK in the 1990s (human coronavirus 229E LRI 281) and Ghana (human coronavirus 229E A162). Upon 
translation and alignment with the published spike protein sequence of human coronavirus 229E ‘LP’ (isolated in the 
UK in the 1970s), it was found that variation within the translated protein sequences was rather limited. In particular, 
minimal variation was observed between the translated spike protein sequence of human coronaviruses 229E LP and 
ATCC VR-74 (1/1012 amino acid differences), whilst most variation was observed between the translated spike 
protein sequence of human coronaviruses 229E LP and A162 (47/1012 amino acid changes). Further, the translated 
spike protein sequence of human coronavirus 229E A162 showed three clusters of amino acid changes, situated within 
the 5’ half of the translated spike protein sequence. © 1998 Published by Elsevier Science B.V. All rights reserved.

* Corresponding author. Tel./Fax: +44-116-252-2939.


1. Introduction


Coronaviruses were first described as aetiological
agents of human disease in the mid-1960s
when isolated from natural common colds
(Tyrrell and Bynoe, 1965; Hamre and Procknow,
1966). They derive their name from their characteristic
‘crown-like’ appearance in electron micrographs,
imbued by a fringe of club shaped spike
(or peplomer) proteins inserted into the viral envelope.
Virions are lipid enveloped and are approximately
80-120 nm in diameter; there is a
single stranded genome of positive sense RNA
approximately 30 kb in length. They also have a
characteristic replication strategy, in that the posi- 
tive sense genomic RNA is first transcribed into 
negative sense intermediate RNAs (by a virus 
encoded transcriptase) from which a nested 3’ 
co-terminus set of five to eight subgenomic mR- 
NAs (six subgenomic RNAs for human coro- 
naviruses) are transcribed. The subgenomic 
mRNAs have identical 3’ ends but extend for 
different lengths in the 5’ direction (Lai, 1990). 

Antigenically, coronaviruses may be divided 
into two major serogroups and one minor 
serogroup. The two major antigenic serogroups 
are designated coronavirus serogroup 1 (including
human coronavirus 229E) and coronavirus
serogroup 2 (which includes human coronavirus 
OC43). The minor antigenic serogroup (coro- 
navirus serogroup 3) currently only contains a 
single member, avian infectious bronchitis virus 
(Siddell, 1995). 

All coronaviruses possess three major proteins: 
the nucleocapsid (N), membrane (M) and spike 
(S); a minor protein (sM); with some coro- 
naviruses also possessing a haemagglutinin-es- 
terase (HE) glycoprotein (Siddell, 1995). The spike 
glycoprotein is of particular importance in the 
infectious process because: (a) it is the site for the 
virus anti-receptor (Collins et al., 1982); (b) it has 
fusion activity (De Groot et al., 1989); and (c) it 
contains sites against which major neutralising 
antibodies are directed (Jimenez et al., 1986). The 
composition of the spike glycoprotein is therefore 
very relevant to the ability of the virus to evade 
the host’s immune system (La Monica et al.,
1991). 

Human coronaviruses have a world-wide distri- 
bution (Hruskova et al., 1990; Matsumoto and 
Kawana, 1992) and infect all age groups (Gwalt- 
ney, 1980). There is evidence to suggest a role for 
human coronaviruses in the aetiology of enteric 
(Payne et al., 1986), neurological (Stewart et al., 
1992) but, primarily, respiratory disease (Myint, 
1995). Indeed, human coronaviruses are thought to
be responsible for approximately 20% of common 
colds (McIntosh et al., 1970), as well as lower 
respiratory tract infections in infants (McIntosh et 
al., 1974) and the exacerbation of asthma (Johnston
et al., 1995). Prospective studies have indicated
that such human coronavirus induced
respiratory infections tend to occur in cycles, with 
a periodicity of approximately 3 years (Monto 
and Lim, 1974). 

Respiratory re-infections with human coro- 
naviruses are common (Monto and Lim, 1974). 
The mechanism facilitating re-infection is, how- 
ever, unclear. Macnaughton (1982) indicated that 
coronavirus antibodies raised against human 
coronavirus 229E strains (serogroup 1) may not 
be protective against human coronavirus OC43 
strains (serogroup 2) and vice versa. Moreover,
pre-existing coronavirus antibody directed to
the same serotype is not protective against further
coronavirus infection (Callow, 1985). Natural antibodies
against a particular serotype of coronavirus
were protective for approximately four
months only, after which time re-infection by the
same serotype of human coronavirus could occur.

In this study preliminary evidence was obtained
that significant variation in the S protein of the
virus is unlikely to explain the basis of re-infections.
A reverse transcription PCR sequencing
strategy was developed which allows sequence 
data from the spike genes of several geographi- 
cally and chronologically distinct human coro- 
naviruses 229E to be collated and compared. By 
predicting the corresponding amino acid se- 
quences of these spike genes, it has been possible 
to make a preliminary assessment of the degree of 
variation within the corresponding spike protein 
sequences of these isolates and those published 
previously. 


2. Materials and methods 


2.1. Viruses and cells 


Human coronavirus 229E strain VR-74 was 
purchased from the American Type Culture Col- 
lection, MD, USA. Strain LRI 281 was isolated 
from nasal washings obtained in 1990 from a 
child with asthma at the Leicester Royal Infir- 
mary, Leicester, UK. Strain A162 was isolated 
from nasal secretions obtained in 1995 from an 
adult presenting with the common cold at Ku- 
masi, Ghana, West Africa. All specimens were 
transported to the laboratory on dry ice,
aliquoted into 100 µl quantities and stored at
-70°C until required.


2.2. Primers 


Spike gene reverse transcription and PCR 
primers were designed from consensus regions of 
the spike genes of several coronaviruses 229E 
utilising published data (Wesseling et al., 1994). 
Spike gene sequencing primers were designed by a 
‘primer walking’ method utilising human coro- 
navirus 229E strain ATCC VR-74 as template. All 
primers were prepared using β-cyanoethyl phosphoramidite
(CEP) chemistry at the Protein and
Nucleic Acid Laboratory at the University of 
Leicester, Leicester, UK. 
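
Because both strands of each PCR product were sequenced, it can be useful to check which sequencing primers anneal at the same site on opposite strands. The following Python sketch, which is not part of the original protocol, takes a subset of the primer sequences listed in Table 1 and reports the pairs that are exact reverse complements of one another. The function names and the choice of primers are illustrative assumptions.

```python
# Primer sequences (5'->3') copied from Table 1; only a subset is included.
PRIMERS = {
    "JH4": "CCATTATAATATTGAGCAC",
    "JH5": "GTGCTCAATATTATAATGG",
    "JH10": "TTCAGGTGATGCTCACAT",
    "JH19": "ATGTGAGCATCACCTGAA",
    "JH16": "TGACCAGTTGTCCTTTGATGTA",
    "JH21": "TACATCAAAGGACAACTGGTCA",
    "JH30": "GAACCACGTATTCCTACCAT",
    "JH32": "ATGGTAGGAATACGTGGTTC",
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_pairs(primers: dict) -> list:
    """List primer name pairs whose sequences are exact reverse complements."""
    names = sorted(primers)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if primers[a] == reverse_complement(primers[b])
    ]


for first, second in complementary_pairs(PRIMERS):
    print(f"{first} and {second} anneal at the same site on opposite strands")
```

Applied to the full primer list, the same check would flag every such pair; only exact matches over the whole primer length are reported here.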


2.3. Extraction of human coronavirus 229E RNA 


The extraction of human coronavirus 229E
RNA was based on guanidinium isothiocyanate
methodology (Chomczynski and Sacchi, 1987) using
RNAzol B (Biogenesis Ltd, Poole, UK). Once
extracted, the total RNA pellet was allowed to
dry for approximately 25 min at room temperature
and then resuspended in 30 µl of RNAse free
ultra-high quality (UHQ) water containing 20 U/µl
of RNAse inhibitor (Promega).


2.4. Reverse transcription of human coronavirus 
229E RNA 


All reverse transcription reactions were carried 
out in a final volume of 20 µl. Negative controls
comprised RNAse free UHQ water which had 
undergone RNA extraction. 

Initially, for each RNA extraction to be reverse
transcribed, a reverse transcription supermix containing
2 µl of 10× MMLV reverse transcription
(RT) buffer (Stratagene, Cambridge, UK), 2 µl of
a 5 mM mix of deoxynucleotide triphosphates
(dNTPs), 0.5 µl of 100 mM dithiothreitol (Sigma,
Poole, UK), 1 µl of 10 pg/ml gelatin, 3 µl of UHQ
RNAse free water and 1 µl of downstream primer
LPS2 (see Table 1) at a stock concentration of 25
pmoles was prepared. Of this RT supermix, 9.5
µl was then transferred to a labelled 0.5 ml sterile
RNAse free Eppendorf and overlaid with sterile
mineral oil (Sigma, Poole, UK). RNA extract (10
µl) was then added to its respective Eppendorf
and the resultant RT/RNA mixes heated to 70°C
for 5 min. After this time, the RT/RNA mixes
were placed immediately on ice for 5 min and 0.5
µl of MMLV reverse transcriptase (Stratagene,
Cambridge, UK) then added to each reaction
mix. The reaction mixes were then placed in a
pre-heated Trioblock thermocycler (Biometra,
Maidstone, UK) at 37°C for 1 h. After 1 h the
reverse transcription/RNA mixes were heated to
95°C for 5 min and then cooled to 4°C prior to
use in the human coronavirus 229E nested spike
gene PCR.


Table 1
Primers utilised in the human coronavirus 229E spike gene
RT-nested PCR and subsequent cycle sequencing reactions

Primer    Fragment generated    Primer sequence

(1) Reverse transcription primer
LPS2                            5’ GCCACAGCAACCAGTAGA 3’

(2) Spike gene nested PCR primers
LPS1      NA                    5’ AATAATTGGTTCCTTCTAAC 3’
LPS2      NA                    5’ GCCACAGCAACCAGTAGA 3’
JH1       F1                    5’ TTTGTTGCTTAATTGCTTATGG 3’
JH2       F1                    5’ TTTGCCAAAAGAAAAAGGGC 3’
JH3       F2                    5’ CCTTTTTCTTTTGGCAAAG 3’
JH4       F2                    5’ CCATTATAATATTGAGCAC 3’
JH5       F3                    5’ GTGCTCAATATTATAATGG 3’
JH6       F3                    5’ ACAACAACATAATAGCA 3’

(3) Cycle sequencing primers
JH1       F1                    5’ TTTGTTGCTTAATTGCTTATGG 3’
JH2       F1                    5’ TTTGCCAAAAGAAAAAGGGC 3’
JH3       F2                    5’ CCTTTTTCTTTTGGCAAAG 3’
JH4       F2                    5’ CCATTATAATATTGAGCAC 3’
JH5       F3                    5’ GTGCTCAATATTATAATGG 3’
JH6       F3                    5’ ACAACAACATAATAGCA 3’
JH7       F1                    5’ TCTGATGTCATACGTTACAACC 3’
JH8       F2                    5’ GTAAGTACTATACTATAGG 3’
JH9       F3                    5’ TCTCATTAGCAATTCAGGC 3’
JH10      F1                    5’ TTCAGGTGATGCTCACAT 3’
JH11      F2                    5’ ACGTACACATCAACTTCAGG 3’
JH12      F3                    5’ GGATGTTGTTCATCAACAAG 3’
JH13      F1                    5’ CACTTTAGGTAATGTAGAAGC 3’
JH14      F5                    5’ CTATAATTGCTGTTCAACCACG 3’
JH15      F3                    5’ TGAGTGTGTCAAATCCCAG 3’
JH16      F4                    5’ TGACCAGTTGTCCTTTGATGTA 3’
JH17      F5                    5’ AGACGCCTTAAGAAATAGCG 3’
JH18      F4                    5’ CGTTTATTGTGTTGTACGTTG 3’
JH19      F1                    5’ ATGTGAGCATCACCTGAA 3’
JH20      F1                    5’ GCTTCTACATTACCTAAAGTG 3’
JH21      F1                    5’ TACATCAAAGGACAACTGGTCA 3’
JH22      F1                    5’ CAACGTACAACACAATAAACG 3’
JH23      F4                    5’ CCTATAGTATAGTACTTAC 3’
JH24      F4                    5’ CCTGAAGTTGATGTGTACGT 3’
JH25      F2                    5’ CGTGGTTGAACAGCAATTATAG 3’
JH26      F2                    5’ CGCTATTTCTTAAGGCGTCT 3’
JH27      F5                    5’ TGCCTGAATTGCTAATGAGA 3’
JH28      F5                    5’ CTTGTTGATGAACAACATCC 3’
JH29      F3                    5’ CTGGGATTTGACACACTCA 3’
JH30      F3                    5’ GAACCACGTATTCCTACCAT 3’
JH31      F3                    5’ TTGACCAGTGAAATTAGCACCC 3’
JH32      F3                    5’ ATGGTAGGAATACGTGGTTC 3’
JH33      F3                    5’ GGGTGCTAATTTCACTGGTCAA 3’
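
The reagent volumes given for the reverse transcription reaction in Section 2.4 can be checked arithmetically. The short Python sketch below is an illustration rather than part of the published method: it sums the listed supermix components and then adds the RNA extract and the enzyme, to confirm that the figures agree with the stated 9.5 µl aliquot and 20 µl final volume. The dictionary keys are descriptive labels chosen here.

```python
from decimal import Decimal

# Volumes in microlitres, as listed for one reverse transcription reaction.
SUPERMIX_UL = {
    "10x MMLV RT buffer": Decimal("2"),
    "5 mM dNTP mix": Decimal("2"),
    "100 mM dithiothreitol": Decimal("0.5"),
    "gelatin": Decimal("1"),
    "UHQ RNAse free water": Decimal("3"),
    "primer LPS2": Decimal("1"),
}
RNA_EXTRACT_UL = Decimal("10")
REVERSE_TRANSCRIPTASE_UL = Decimal("0.5")

# Decimal arithmetic avoids binary floating point rounding of values like 0.5.
supermix_total = sum(SUPERMIX_UL.values())
reaction_total = supermix_total + RNA_EXTRACT_UL + REVERSE_TRANSCRIPTASE_UL

print(f"supermix per reaction: {supermix_total} µl")
print(f"final reaction volume: {reaction_total} µl")
```

The same bookkeeping scales directly when a supermix is prepared for several RNA extractions at once, by multiplying each component volume by the number of reactions.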


2.5. Human coronavirus 229E spike gene nested 
PCR 


All first and second round PCR reactions were 
carried out in a final volume of 50 µl. All stock
primers had a concentration of 25 pmoles. 


2.5.1. First round PCR protocol 

Initially, a first round PCR supermix was prepared
containing 31.6 µl of sterile UHQ water, 5
µl of 10× Thermus icelandicus PCR buffer (Advanced
Biotechnologies, Leatherhead, UK), 6 µl
of 25 mM magnesium chloride, 0.4 µl of a 5 mM
mix of dNTPs, 1 µl of primer LPS1 (see Table 1)
and 1 µl of primer LPS2 (see Table 1) per reverse
transcribed specimen to be PCR amplified. This
PCR supermix (44.8 µl) was then pipetted into a
labelled 0.5-ml Eppendorf and overlaid with sterile
mineral oil. Reverse transcribed human coronavirus
229E spike gene cDNA (or negative
control cDNA) (5 µl) was then added to its respective
Eppendorf and the first round PCR reaction
mixes transferred to a pre-heated (95°C)
Trioblock thermocycler and subjected to a ‘hot
start’ and ‘touchdown’ PCR protocol with 0.2 µl
‘Red Hot’ Thermus icelandicus DNA polymerase
(Advanced Biotechnologies, Leatherhead, UK).
The initial phase consisted of 20 cycles of 92°C
for 30 s, thermal ramp to 65°C for 1 min, thermal
ramp to 72°C for 4 min then thermal ramp to
92°C. This was followed by ten cycles of 92°C for
30 s, thermal ramp to 55°C for 1 min, thermal
ramp to 72°C for 4 min, then a thermal ramp to
92°C. PCR products were then cooled to 4°C and
stored until second round reaction mixes had been
prepared.
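
The two-phase cycling regime described above can be written down as a small data structure, which makes it easy to tally the number of cycles and the programmed hold time. The Python sketch below is illustrative only: the class and function names are assumptions, and the thermal ramp times are ignored because they are not stated for this protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    cycles: int
    steps: tuple  # each step is (temperature in °C, hold time in seconds)


# First round profile: 20 cycles annealing at 65°C, then 10 cycles at 55°C.
FIRST_ROUND = (
    Phase(cycles=20, steps=((92, 30), (65, 60), (72, 240))),
    Phase(cycles=10, steps=((92, 30), (55, 60), (72, 240))),
)


def total_cycles(program) -> int:
    """Total number of amplification cycles across all phases."""
    return sum(phase.cycles for phase in program)


def total_hold_seconds(program) -> int:
    """Sum of programmed hold times, excluding thermal ramps."""
    return sum(
        phase.cycles * sum(hold for _, hold in phase.steps)
        for phase in program
    )


print(f"{total_cycles(FIRST_ROUND)} cycles, "
      f"{total_hold_seconds(FIRST_ROUND) / 60:.0f} min of programmed holds")
```

Each cycle holds for 330 s, so the 30 cycles amount to 165 min before ramp times are added.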
[Schematic: human coronavirus 229E spike gene with the second round amplification fragments, including F4 (primers JH16 and JH24) and F5 (primers JH14 and JH28). Key: F = fragment.]

Fig. 1. Schematic representation of the amplification products generated using the human coronavirus 229E spike gene nested PCR.


2.5.2. Second round PCR protocol 

In the second round of the human coronavirus
229E spike gene PCR, five ‘second round reaction
mixes’ were prepared for each of the first round
PCR amplification products to be re-amplified.
Initially, a second round supermix containing 33
µl of sterile UHQ water, 5 µl of 10× Thermus
icelandicus PCR buffer (Advanced Biotechnologies),
6 µl of 25 mM magnesium chloride and
0.4 µl of a 5 mM mix of dNTPs was prepared for
each first round PCR amplification undertaken.
This second round supermix (44.4 µl) was then
aliquoted into labelled sterile 0.5-ml Eppendorfs.
Next, five separate ‘primer pair mixes’ were
prepared containing either (i) 0.2 µl of primer JH1
and 0.2 µl of primer JH2; (ii) 0.2 µl of primer JH3
and 0.2 µl of primer JH4; (iii) 0.2 µl of primer
JH5 and 0.2 µl of primer JH6; (iv) 0.2 µl of primer
JH16 and 0.2 µl of primer JH24; and (v) 0.2 µl of
primer JH14 and 0.2 µl of primer JH28 (see Table
1) per first round PCR amplification undertaken.
Each of these primer pair mixes (0.4 µl) was then
added to its respective second round supermix
aliquot and the resulting ‘complete mixes’
overlaid with sterile mineral oil. A 1:10 (v/v)
dilution of the first round amplification products
was then prepared in sterile UHQ water and 5 µl
of the resultant PCR product dilution added to
each of its five respective complete second round
reaction mixes. Round 2 PCR reaction mixes were
then transferred to a pre-heated (95°C) Trioblock
thermocycler and subjected to a ‘hot start’ and
‘touchdown’ PCR protocol as already described
for the first round. After completion of this
second round PCR cycling regime, PCR products
were cooled to 4°C and amplified PCR products
observed by gel electrophoresis and ethidium
bromide staining.

Fig. 1 indicates schematically the region of the 
human coronavirus 229E spike gene amplified by 
this reverse transcription and nested PCR 
protocol. 
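
The final reagent concentrations in each 50 µl PCR reaction follow from the stock concentrations and volumes given in Section 2.5 by the usual dilution relationship C1 × V1 = C2 × V2. The Python sketch below works this through for the magnesium chloride and dNTP stocks. It is an illustration under two assumptions that the text does not settle: any magnesium contributed by the 10× buffer is ignored, and the "5 mM mix of dNTPs" is treated as a single stock concentration without deciding whether it refers to each nucleotide or to the total.

```python
FINAL_VOLUME_UL = 50.0  # stated final volume of every PCR reaction


def final_concentration(stock_conc: float, volume_added_ul: float,
                        final_volume_ul: float = FINAL_VOLUME_UL) -> float:
    """Solve C1 * V1 = C2 * V2 for C2, in the same units as stock_conc."""
    return stock_conc * volume_added_ul / final_volume_ul


# 6 µl of 25 mM magnesium chloride and 0.4 µl of the 5 mM dNTP mix.
mgcl2_mM = final_concentration(25.0, 6.0)
dntp_mM = final_concentration(5.0, 0.4)

print(f"MgCl2: {mgcl2_mM:.2f} mM")
print(f"dNTP mix: {dntp_mM * 1000:.0f} µM")
```

Under these assumptions the reactions contain 3 mM added magnesium chloride and 40 µM of the dNTP mix.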


2.6. Cycle sequencing protocol 


Human coronavirus 229E spike gene PCR 
product sequencing was undertaken using the 
PRISM™ di-deoxy terminator cycle sequencing 
kit (Applied Biosystems, Foster City, USA). Se- 
quencing of both the sense and antisense strands 
of the human coronavirus 229E spike gene PCR 
DNA was undertaken, with some sequencing 
primers being used more than once to increase the 
accuracy of generated sequence data at a particu- 
lar locus. 


2.6.1. Cleaning second round PCR products 

Prior to sequencing, amplified human coro- 
navirus 229E PCR products were cleaned using 
‘Qiaquick’ spin columns as detailed by the manufacturer
(Qiagen, Hilden, Germany).


2.6.2. Cycle sequencing

Each individual cycle sequencing mix contained
8 µl of pre-prepared dye-terminator mix, 0.05-0.1
µg cleaned human coronavirus 229E second
round PCR product, 3.2 pmol of the relevant
sequencing primer and the correct volume of
UHQ water to make a total cycle sequencing
reaction mix volume of 20 µl. Once prepared, the
cycle sequencing mix was overlaid with mineral
oil and placed in a pre-heated (96°C) Trioblock
thermocycler. A cycle sequencing temperature
regime was then undertaken with 25 cycles of
96°C for 10 s, thermal ramp to 50°C in 64 s for 5
s, thermal ramp to 55°C in 30 s for 241 s, then
thermal ramp to 94°C in 54 s, followed by cooling
to 4°C.


2.6.3. Cleaning cycle sequencing products 

The removal of unincorporated nucleotides and 
enzymes from cycle sequencing products was 
achieved using a standard phenol/chloroform/isoamyl
alcohol extraction and sodium
acetate (pH 4.5)/ethanol precipitation 
methodology. 


2.6.4. Assimilation of sequence data and 
generation of consensus sequences 

Cleaned cycle sequencing products were run on 
an ABI 373 DNA sequencer (Applied Biosystems, 
Foster City, USA). Resultant chromatograms 
were examined using Sequence Editor™ software 
(Applied Biosystems) and a library of text only 
sequences generated. Each individual human 
coronavirus 229E spike gene library was then 
assembled using AutoAssembler™ software
(Applied Biosystems) to generate a human 
coronavirus 229E spike gene consensus sequence 
for that particular human coronavirus 229E 
isolate. 


3. Results 


In total, 33 sequencing primers were designed 
and used to sequence approximately 90% of the 
spike genes of human coronavirus 229E isolates 
ATCC VR-74, LRI 281 and A162 (when compared
to the published human coronavirus 229E
‘LP’ spike gene sequence; Raabe et al., 1990). Six
of these 33 sequencing primers (i.e. JH1, JH2, 
JH3, JH4, JH5 and JH6) were also used in the 
initial human coronavirus 229E spike gene PCR 
protocol, whilst the remaining 27 primers were 
used as sequencing primers alone. Sequence data 
was collected from both the sense and antisense 
strands of human coronavirus 229E spike gene 
PCR products. 


3.1. Human coronavirus 229E strain ATCC VR-74 


Forty one individual primer sequences were 
used to construct a human coronavirus 229E 
strain ATCC VR-74 consensus sequence (Au- 
toAssembler software) comprising 3122 nucle- 
otides. This 3122-nucleotide consensus sequence 
was assembled from a total library of 13021 
individual nucleotides. Forty nine of these 13021 
nucleotides were deemed to have been included 
via mis-incorporation errors by the Thermus icelandicus
DNA polymerase and MMLV reverse transcriptase
enzymes (mis-incorporation errors deemed to have
occurred when the nucleotide at a particular locus 
within the total spike gene assemblage differed 
from that of the same locus in the spike gene 
consensus sequence and where this nucleotide dif- 
ference occurred in either the sense or antisense 
strand only). Similarly, 24 nucleotides within the 
13021 total nucleotide assemblage were deemed 
to contain nucleotide additions and 16 loci nucle- 
otide deletions (data not shown). From this data 
an overall mis-incorporation rate for the human 
coronavirus 229E reverse transcription nested 
spike gene PCR of 0.7% was calculated. Fig. 2
shows the amino acid sequence obtained upon 
translation of the human coronavirus 229E strain 
ATCC VR-74 spike gene consensus sequence. 
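
The reported 0.7% figure can be reproduced from the counts given above. The Python sketch below assumes, since the calculation is not spelled out in the text, that the rate is the sum of substitutions, additions and deletions divided by the total number of nucleotides in the sequence library. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyErrors:
    library_nt: int     # total nucleotides in the sequence library
    substitutions: int  # loci differing from the consensus on one strand only
    additions: int
    deletions: int

    @property
    def error_rate_percent(self) -> float:
        """Assumed definition: all error classes over the library size."""
        errors = self.substitutions + self.additions + self.deletions
        return 100 * errors / self.library_nt


# Counts reported for human coronavirus 229E strain ATCC VR-74.
vr74 = AssemblyErrors(library_nt=13021, substitutions=49,
                      additions=24, deletions=16)

print(f"{vr74.error_rate_percent:.1f}%")  # prints 0.7%
```

With 89 discrepant positions in 13021 nucleotides the rate is about 0.68%, which rounds to the reported value.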


3.2. Human coronavirus 229E LRI 281 


Forty six individual primer sequences were employed
to construct a human coronavirus 229E
LRI 281 S gene consensus sequence. A different
number of primers were used for this strain because
the read length from individual sequences
varied, with some reactions not generating the
required number of bases. The resultant 3139
nucleotide consensus sequence was assembled
from a total library of 14946 individual nucleotides.
Misincorporation errors were deemed to
have occurred at 25 positions within this 14946
total nucleotide assemblage, with 29 positions
deemed to contain nucleotide additions and 29
positions nucleotide deletions (data not shown).
Fig. 2 shows the amino acid sequence obtained
upon translation of the human coronavirus 229E
LRI 281 spike gene consensus sequence.


LP    1  MFVLLVAYAL LHIAGCQTIN GLNTSYSVCN GCVGYSENVF AVESGGYIPS   50
LP   51  DFAFNNWFLL TNISSVVDGV VRSFQPLLLN CLWSVSGLRF TIGFVYFNGT  100
LP  101  GRGDCKGFSS DVLSDVIRYN LNFEENLRRG TILFKTSYGV VVFYCTNNTL  150
LP  151  VSGDAHIPFG TVLGNFYCFV NTTIGNETTS AFVGALPKTV REFVISRTGH  200
LP  201  FYINGYRYFT LGNVEAVNFN VITAETTDFC TVALASYADV LVNVSQTSIA  250
LP  251  NIIYCNSVIN RLRCDQLSFD VPDGFYSTSP IQSVELPVSI VSLPVYHKHT  300
LP  301  FIVLYVDFKP QSGGGKCFNC YPAGVNITLA NFNETKGPLC VDTSHFTTKY  350
LP  351  VAVYANVGRW SASINTGNCP FSFGKVNNFV KFGSVCFSLK DIPGGCAMPI  400
LP  401  VANWAYSKYY TIGSLYVSWS DGDGITGVPQ PVEGVSSFMN VTLDKCTKYN  450
LP  451  IYDVSGVGVI RVSNDTFLNG ITYTSTSGNL LGFKDVTKGT IYSITPCNPP  500
LP  501  DQLVVYQQAV VGAMLSENFT SYGFSNVVEL PKFFYASNGT YNCTDAVLTY  550
LP  551  SSFGVCADGS IIAVQPRNVS YDSVSAIVTA NLSIPSNWTT SVQVEYLQIT  600
LP  601  STPIVVDCST YVCNGNVRCV ELLKQYTSAC KTIEDALRNS ARLESADVSE  650
LP  651  MLTFDKKAFT LANVSSFGDY NLSSVIPSLP TSGSRVAGRS AIEDILFSKL  700

Fig. 2. Comparison of the predicted amino acid sequence for the spike proteins of several human coronaviruses 229E. LP, human
coronavirus 229E isolate LP (Raabe et al., 1990); ATCC VR-74, human coronavirus 229E ATCC VR-74 (EMBL Accession No.
Y09923); LRI 281, human coronavirus 229E isolate LRI 281 (EMBL Accession No. Y10052); A162, human coronavirus 229E
isolate A162 (EMBL Accession No. Y10051); ---, regions of homology between translated spike protein sequences; *, spike protein
loci with an absence of conservation; }{, spike gene region sequences.


Fig. 3. Failed sequencing primers and primer binding regions on the human coronavirus 229E A162 spike gene.


3.3. Human Coronavirus 229E A162 


Forty one individual primer sequences were 
employed to construct a human coronavirus 229E 
A162 consensus sequence some 3046 nucleotides 
in length. This 3046 nucleotide consensus sequence 
was constructed from 13974 individual nucle- 
otides. Mis-incorporation errors were deemed to 
have occurred at 31 positions within this 13974 
total nucleotide assemblage, with 21 positions 
deemed to contain nucleotide additions and 36 
positions nucleotide deletions (data not shown). 
Fig. 2 shows the amino acid sequence obtained
upon translation of the human coronavirus 229E 
A162 spike gene consensus sequence. 


During the sequencing of the human coro- 
navirus 229E A162 spike gene it was found that 
sequence data could not be obtained from three of 
the 33 sequencing primers used successfully in the 
human coronavirus 229E strain ATCC VR-74 and 
human coronavirus 229E LRI 281 spike gene 
sequencing protocols. Upon assembly of the hu- 
man coronavirus 229E A162 spike gene consensus 
sequence, however, it was determined that nucle- 
otide differences within the human coronavirus 
229E A162 spike gene consensus sequence (as 
compared to the spike gene consensus sequences 
of human coronavirus 229E strain ATCC VR-74 
and human coronavirus 229E LRI 281) may have 
affected the primer binding sites for these particu- 
lar sequencing primers. Fig. 3 shows the effect of 
human coronavirus 229E A162 spike gene se- 
quence changes on the primer binding capacity of 
these three failed sequencing primers. 
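
The effect of template sequence changes on a primer binding site can be illustrated by counting mismatches between a primer and the best-matching window of a template. The Python sketch below is a simplified illustration and not the analysis used in this study. The template is a made-up sequence containing two substitutions, not A162 data, and the primer is compared directly with a same-strand window rather than modelling annealing to the complementary strand. Only the JH7 sequence is taken from Table 1.

```python
def mismatch_positions(primer: str, site: str) -> list:
    """0-based positions (from the primer 5' end) where primer and site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    return [i for i, (p, s) in enumerate(zip(primer, site)) if p != s]


def best_binding_site(primer: str, template: str) -> tuple:
    """Return (offset, mismatch positions) for the window with fewest mismatches."""
    windows = range(len(template) - len(primer) + 1)
    offset = min(
        windows,
        key=lambda i: len(mismatch_positions(primer, template[i:i + len(primer)])),
    )
    return offset, mismatch_positions(primer, template[offset:offset + len(primer)])


JH7 = "TCTGATGTCATACGTTACAACC"  # sequencing primer from Table 1

# Hypothetical template: the JH7 site with two invented substitutions near
# the 3' end, flanked by arbitrary bases. This is not sequence data.
HYPOTHETICAL_TEMPLATE = "AAGG" + "TCTGATGTCATACGTTATAATC" + "GGTT"

offset, mismatches = best_binding_site(JH7, HYPOTHETICAL_TEMPLATE)
distance_from_3_prime = [len(JH7) - 1 - pos for pos in mismatches]
print(offset, mismatches, distance_from_3_prime)
```

Mismatches close to the 3’ end of a primer are generally the most disruptive to extension, which is why the distance from the 3’ end is reported alongside the positions.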


4. Discussion


In this study touchdown PCR methodology was
used due to the fact that sequentially decreasing
the annealing temperature (from a preset maximum
to a preset minimum) allows PCR amplification
reactions to be attempted using a wide range
of PCR primers even if these primers are some- 
what mis-matched with regard to their predicted 
annealing temperatures (Roux, 1994). Also, it was 
envisaged that the use of a touchdown PCR 
methodology would help to circumvent any non- 
specific priming (Don et al., 1991); this was con- 
sidered particularly important because the human 
coronavirus 229E spike gene PCR primer design 
protocol utilised limited the number of possible 
primer sequences available for spike gene amplifi- 
cation. Touchdown PCR methodology did not 
inhibit, however, the production of extraneous 
PCR products during the first round of the hu- 
man coronavirus 229E spike gene PCR. In effect, 
this meant that a nested PCR methodology had to 
be employed. 

Extraneous first round PCR products were pro- 
duced in such quantities that after addition of 
‘neat’ first round PCR product to second round 
reaction mixes and subsequent second round PCR 
cycling, the extraneous first round PCR products 
could still be detected in the background of sec- 
ond round PCR products. More importantly, this 
background of extraneous first round PCR prod- 
ucts tended to interfere with subsequent sequenc- 
ing reactions, with the result that sequence data 
could not be obtained from second round PCR 
mixes to which neat first round PCR products 
had been added. By diluting the first round PCR 
products 1:10 (v/v) in sterile UHQ water, this 
problem was overcome and sequence data could 
be readily obtained. 

An automated sequencing methodology was 
chosen over manual methods due to the relatively 
large number of sequencing reactions required to 
generate human coronavirus 229E spike gene con- 
sensus sequences and the relatively high through- 
put rates achievable with automated sequencing. 

The use of automated dye-terminator sequenc- 
ing chemistry per se was favoured over the use of 
automated dye-primer sequencing chemistry due 
to the fact that: (a) unadulterated PCR (or se- 
quencing) primers can be utilised in dye-termina- 
tor automated sequencing chemistry without the 
need to label the primers with fluorescent dye tags 
(greatly reducing costs); (b) dye-terminator se- 
quencing reactions are carried out in a single tube 
whilst dye-primer sequencing reactions are carried
out in four separate reaction tubes (again reducing
sequencing costs); (c) false termination products
are not detected using dye-terminator
chemistry as a labelled dye-terminator must be 
incorporated into the DNA chain in order for the 
DNA to be detected; and (d) dye-terminator se- 
quencing chemistry requires lower concentrations 
of template DNA allowing several sets of se- 
quence data to be generated with several different 
sequencing primers from a single human coro- 
navirus 229E PCR reaction (DNA Sequencing: 
Chemistry Guide, 1995). 

Automated dye-terminator sequencing does 
however have its drawbacks. In particular, both a 
reduced sequencing accuracy rate (Naeve et al., 
1995) and the generation of artifactual chro- 
matogram peaks (Parker et al., 1995) have been 
associated with the use of such chemistry. Indeed, 
Parker et al. (1995) indicated that the use of 
automated dye-terminator sequencing chemistry 
resulted in chromatogram artifactual peaks whose 
presence was determined by the nucleotide sequence
immediately 5’ (i.e. downstream) to the
artefact itself. Interestingly, it was noted that the 
nucleotide sequences immediately 5’ to chro- 
matogram artefactual peaks in this project did not 
correspond with the 5’ nucleotide sequences gen- 
erating artefactual peaks indicated by Parker et al. 
(1995) (even though automated dye-terminator 
sequencing chemistry was utilised in both 
projects). This apparent discrepancy was brought 
about by the fact that Parker et al. utilised AmpliTaq
enzyme (Perkin Elmer) in their dye-terminator
sequencing reactions, whereas in this project AmpliTaq
FS (FS, fluorescent sequencing) enzyme
(Perkin Elmer) was utilised. The conclusion is
therefore that the pattern of chromatogram artefactual
peaks generated in dye-terminator
sequencing chemistry reactions by these two enzymes
is somewhat different.

Taking the comments made about the method- 
ology into account, the results from the sequenc- 
ing of the spike genes of the human coronavirus 
229E isolates presented in this paper suggest that 
only minor spike protein variation exists between 
these chronologically and geographically distinct 
isolates. This appears to contrast with another 
member of the same coronavirus serogroup as 
human coronaviruses 229E, canine coronavirus
(Horsburgh and Brown, 1995) in which heterogeneity
was distributed throughout the spike gene
sequences of two geographically distinct (one 
British and one American) isolates. Also, a high
frequency of genome recombination events has been
observed during mixed infections with different 
murine coronavirus strains (Lai et al., 1985) and 
different avian infectious bronchitis virus strains 
(Kusters et al., 1990). Moreover, there is evidence 
to suggest that in the absence of selection pressure 
these recombination events occur in a random 
manner (Banner and Lai, 1990). As only minor 
changes are seen in the translated spike protein 
sequences obtained in this project, it may be 
possible that during human coronavirus 229E in- 
fections selection pressure plays a dominant role 
in limiting the degree of spike gene variation 
(alternatively, mixed infections with more than
one human coronavirus 229E strain may be
relatively rare).

One possible explanation for the degree of ho- 
mology observed between the translated spike 
protein sequences of human coronavirus 229E 
‘LP’ and human coronavirus 229E strain ATCC 
VR-74 relies upon the fact that both of these 
viruses were adapted to tissue culture by serial 
passaging. In particular, MRC5 (Medical Re- 
search Council No. 5) human embryonic lung 
fibroblasts were the final replicative host for hu- 
man coronavirus 229E strain ATCC VR-74, 
whilst Clone 16 human embryonic lung fibroblasts 
(the 16th clone of heteroploid MRC-c cells;
Philpotts, 1983) were the final replicative host for
human coronavirus 229E ‘LP’ (Raabe et al., 
1990). As MRC5 and MRC-c cells are closely 
related, it may be feasible that adaptation of human
coronaviruses 229E to these similar cell lines
facilitates similar spike protein conformations.
These similar spike protein conformations would 
be expected to have similar protein sequences and 
by reverse translation, similar spike gene se- 
quences. Further, if such ‘convergent evolution’ 
does indeed occur, then it is possible that the 
spike gene sequences obtained from serially passaged,
tissue culture adapted human coronaviruses
229E may differ from the spike gene
sequences of the original isolates. In this case, the 
accuracy of spike gene sequence data from human 
coronaviruses 229E which had been isolated by
serial passage in tissue culture may be called into
question. In this project, spike gene PCR amplifi- 
cation of human coronavirus 229E isolates LRI 
281 and A162 was undertaken directly from clini- 
cal specimens. 

Though comparatively few changes were ob- 
served between the human coronavirus 229E 
translated spike proteins compared in this project, 
the majority of such changes were observed in the 
5’ half of the spike protein sequence. This point is 
illustrated most obviously upon examination of 
the translated spike protein sequence of human 
coronavirus 229E A162, where three apparent 
clusters of amino acid variation were observed, all
in the 5’ half of the translated spike protein 
sequence. Interestingly, research by Banner et al. 
(1990), working with mouse hepatitis virus, indi- 
cated that the 5’ end of the murine coronavirus 
spike gene may be the preferred site for RNA 
recombination events for this particular coro- 
navirus. Moreover, comparisons of the spike 
protein sequences of avian infectious bronchitis 
virus, feline infectious peritonitis virus, murine 
hepatitis virus and transmissible gastro-enteritis 
virus have indicated that the 3’ (C-terminal)
portion of the spike protein is rather more conserved
than the corresponding 5’ (amino-terminal)
portion (Cavanagh, 1995).
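
The positional comparison described above can be sketched as follows. This
is a minimal illustration rather than the procedure used in this project:
it assumes two pre-aligned protein sequences of equal length, and the
function names, the toy sequences and the 10-residue gap threshold used to
group changes into clusters are all hypothetical.

```python
from typing import List, Tuple


def differing_positions(ref: str, query: str) -> List[int]:
    """Return 1-based alignment positions at which two aligned sequences differ."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(ref, query)) if a != b]


def group_into_clusters(positions: List[int], max_gap: int = 10) -> List[Tuple[int, int, int]]:
    """Group positions into (start, end, count) runs whose neighbours lie within max_gap."""
    clusters: List[Tuple[int, int, int]] = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][1] <= max_gap:
            start, _, count = clusters[-1]
            clusters[-1] = (start, pos, count + 1)
        else:
            clusters.append((pos, pos, 1))
    return clusters


def fraction_in_n_terminal_half(positions: List[int], length: int) -> float:
    """Fraction of differing positions that fall in the first half of the sequence."""
    if not positions:
        return 0.0
    return sum(1 for p in positions if p <= length // 2) / len(positions)


# Toy aligned sequences (hypothetical, not real spike protein data).
reference = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQRSTVWY"
variant = "ACDQFGHIRLMNPQRSTVWYACDEFGHIKLMNPQRSTVWF"

changes = differing_positions(reference, variant)
print("differing positions:", changes)
print("clusters (start, end, count):", group_into_clusters(changes))
print("fraction in N-terminal half:", fraction_in_n_terminal_half(changes, len(reference)))
```

The gap threshold determines what counts as a cluster, so any cluster count
obtained this way should be reported together with the threshold used.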

In theory, it is possible that minor variations in 
the translated sequences of human coronavirus 
229E spike proteins may facilitate a relatively 
major change to the antigenicity of the spike 
protein. However, comparison of the predicted
antigenic indices of the translated spike proteins
of human coronaviruses 229E strain ATCC VR-74,
LRI 281 and A162 (data not shown) using the
‘peptidestructure’ and ‘plotstructure’ computer
applications (Genetics Computer Group Inc.
v8.0, Madison, WI, USA) revealed no major
differences between these isolates.

Taken as a whole, these results indicate that 
variation in the spike proteins of chronologically 
and geographically distinct human coronaviruses 
229E may be rather limited. Such an interpreta- 
tion would tend to suggest that spike protein 
variation does not play a major role in the aetiology
of human coronavirus 229E re-infection.
However, in order to assess fully the role that
spike protein variation plays in human coronavirus
229E re-infection, further work is required.
In particular, sequencing of the spike
genes from other chronologically and geographi- 
cally distinct human coronaviruses 229E should 
be undertaken. Further, by cloning and expressing 
the spike proteins of human coronaviruses 229E 
in vitro, it may be possible to assess the role of the 
immune system in the aetiology of human coro- 
navirus 229E re-infection. 